Mortgage Probability of Default & Mortgage Fraud¶

Data Analyst: Frankie Ma

Introduction: Mortgage fraud and default pose significant risks to both lenders and borrowers in the U.S. housing market. With the rise of online services offering fake documentation and fraudulent verification, lenders face increasing difficulty in validating borrower income and intent. These schemes—often masked as novelty tools—can lead to occupancy misrepresentation, inflated credit profiles, and fraudulent loan approvals. As lenders like Fannie Mae and Freddie Mac tighten their risk protocols, the consequences of undetected fraud ripple through financial institutions, investors, and ultimately taxpayers. At the same time, mortgage defaults continue to trigger complex legal processes, including foreclosures, which carry steep financial and reputational costs. In this project, we apply machine learning models to assess default risk and explore how data-driven strategies can help lenders better detect anomalies, protect against fraudulent applications, and reduce financial losses across varying interest rate scenarios.

In [1]:
from IPython import display
display.Image("Mortgage.png")
Out[1]:
[Image: Mortgage.png]

[This picture is generated by AI]

Project Goal: Our project aims to develop predictive models that identify high-risk mortgage loan applications by detecting potential defaults and fraudulent behavior. By leveraging machine learning and profit-based evaluation across different interest rate scenarios, we seek to help lenders make more informed, data-driven decisions that reduce financial exposure and enhance loan portfolio quality.

Table of Contents¶

  • Section 1 Anomaly Detection through Feature Engineering
    • 1.1 Target Encoding
      • 1.1.1 Data Splitting
    • 1.2 Imputing Missing Values
  • Section 2 Model Building
    • 2.1 Random Forest
      • 2.1.1 Model Performance
      • 2.1.2 Feature Importance
    • 2.2 Top 10 Features Random Forest
  • Section 3 SHAP
    • 3.1 Fit Explainer
    • 3.2 SHAP Plots
      • 3.2.1 Bar Plot for Feature Importance
      • 3.2.2 Summary Plot
      • 3.2.3 Waterfall Plot
        • 3.2.3.1 Observation 1
        • 3.2.3.2 Observation 2
        • 3.2.3.3 Observation 3
        • 3.2.3.4 Observation 4
      • 3.2.4 Force Plot
        • 3.2.4.1 Observation 1
        • 3.2.4.2 Observation 2
        • 3.2.4.3 Observation 3
        • 3.2.4.4 Observation 4
      • 3.2.5 Dependence Plot
      • 3.2.6 Heatmap
      • 3.2.7 Decision Plot Comparison
    • 3.3 Feature Importance Under SHAP
  • Section 4 Summary & Conclusions
In [2]:
# Before we get started, let's load all the packages we are going to use in this project.
# data
import pandas as pd
import numpy as np

# visualization
import matplotlib as mpl
import matplotlib.pyplot as plt
%matplotlib inline
import seaborn as sns
#import missingno as msno
import plotly.express as px
import plotly.figure_factory as ff
import plotly.graph_objects as go
#from wordcloud import WordCloud

from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# styling
sns.set_style('darkgrid')
mpl.rcParams['font.size'] = 12
mpl.rcParams['figure.facecolor'] = '#00000000'

import os

import warnings
warnings.filterwarnings("ignore")

import shap

Section 1 Anomaly Detection through Feature Engineering ¶

In [3]:
df = pd.read_csv('XYZloan_default_llm.csv')
df.head()
Out[3]:
Unnamed: 0.1 Unnamed: 0 AP001 AP002 AP003 AP006 AP007 AP008 CR004 CR009 ... TD005 TD006 TD009 TD010 TD013 TD014 TD022 TD024 loan_default reason
0 4 76031 33 1 3 h5 4 3 4 63100 ... 4 1 4 1 4 1 10.0 0.0 1 I’d really appreciate if we could move faster ...
1 5 23312 34 1 3 h5 5 5 3 53370 ... 3 1 6 2 7 2 15.0 10.0 1 We’re trying to align the closing date with a ...
2 9 66033 36 2 1 ios 2 2 3 5400 ... 4 2 4 2 5 2 25.0 0.0 1 It would really help to close this week so I c...
3 10 41847 28 1 1 ios 5 5 3 2000 ... 4 4 7 4 7 4 25.0 6.0 1 There are some logistics around my move that m...
4 13 28275 35 2 4 h5 3 3 4 27704 ... 4 1 4 1 7 1 25.0 0.0 1 I’d like to close by Friday if possible—the se...

5 rows × 32 columns

In [4]:
data = df.drop(columns=['Unnamed: 0.1', 'Unnamed: 0', 'reason'])

1.1 Target Encoding ¶

In [5]:
# specify categorical & numeric data type
cat_var = ['AP006', 'MB007']
num_var = ['AP001', 'AP002', 'AP003', 'AP007', 
           'AP008', 'CR004', 'CR009', 'CR015', 'CR017', 'CR018', 'CR019',
           'MB005', 'PA022', 'PA023', 'PA028', 'PA029', 'PA031', 'TD001',
           'TD005', 'TD006', 'TD009', 'TD010', 'TD013', 'TD014', 'TD022', 'TD024']
X_vars = cat_var + num_var
target = 'loan_default'
data[target].value_counts()
Out[5]:
loan_default
0    12924
1     3076
Name: count, dtype: int64
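The class counts above imply a roughly 4:1 imbalance between non-defaults and defaults. As a quick check (a small sketch with the counts copied from the `value_counts()` output above):

```python
import pandas as pd

# Class counts copied from the value_counts() output above
counts = pd.Series({0: 12924, 1: 3076}, name="loan_default")

default_rate = counts[1] / counts.sum()   # share of defaulted loans
imbalance_ratio = counts[0] / counts[1]   # non-defaults per default

print(f"Default rate: {default_rate:.2%}")
print(f"Imbalance ratio: {imbalance_ratio:.1f} : 1")
```

This imbalance is why `class_weight="balanced"` is used in the random forest later on.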

1.1.1 Data Splitting ¶

In [6]:
X = data[X_vars]
y = data[target]

from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

1.2 Imputing Missing Values ¶

In [7]:
X_train_numvar = X_train[num_var]
missing_columns = X_train_numvar.columns[X_train_numvar.isnull().sum() > 0]

# Display the columns with missing values
missing_columns
Out[7]:
Index(['MB005', 'PA022', 'PA023', 'PA028', 'PA029', 'PA031', 'TD022', 'TD024'], dtype='object')
In [8]:
for col in missing_columns:
    mean_value = X_train[col].mean()
    # Impute with the training-set mean in both X_train and X_test (avoids test-set leakage)
    X_train[col] = X_train[col].fillna(mean_value)
    X_test[col] = X_test[col].fillna(mean_value)
In [9]:
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import OneHotEncoder
num_imputer = SimpleImputer(strategy="mean")
cat_imputer = SimpleImputer(strategy="most_frequent")
ohe = OneHotEncoder(drop="first", handle_unknown="ignore", sparse_output=False)
In [10]:
X_train_num = pd.DataFrame(num_imputer.fit_transform(X_train[num_var]),
                           columns=num_var, index=X_train.index)
X_test_num = pd.DataFrame(num_imputer.transform(X_test[num_var]),
                          columns=num_var, index=X_test.index)
In [11]:
X_train_cat_imp = pd.DataFrame(cat_imputer.fit_transform(X_train[cat_var]),
                               columns=cat_var, index=X_train.index)
X_test_cat_imp = pd.DataFrame(cat_imputer.transform(X_test[cat_var]),
                              columns=cat_var, index=X_test.index)

X_train_cat_ohe = pd.DataFrame(
    ohe.fit_transform(X_train_cat_imp),
    index=X_train.index,
    columns=ohe.get_feature_names_out(cat_var)
)
X_test_cat_ohe = pd.DataFrame(
    ohe.transform(X_test_cat_imp),
    index=X_test.index,
    columns=ohe.get_feature_names_out(cat_var)
)

Section 2 Model Building ¶

The one-hot-encoded categorical columns are in X_train_cat_ohe, while the mean-imputed numeric columns are in X_train_num. To prepare the data for modeling, we concatenate the two frames (and do the same for the test data).

In [12]:
X_train_enc = pd.concat([X_train_num, X_train_cat_ohe], axis=1)
X_test_enc = pd.concat([X_test_num, X_test_cat_ohe], axis=1)
X_train_enc = X_train_enc.replace([np.inf, -np.inf], np.nan).fillna(0.0)
X_test_enc = X_test_enc.replace([np.inf, -np.inf], np.nan).fillna(0.0)

feat_names_all = X_train_enc.columns.tolist()

2.1 Random Forest ¶

In [13]:
from sklearn.ensemble import RandomForestClassifier
rf = RandomForestClassifier(
    n_estimators=800,
    max_depth=None,            
    min_samples_split=5,
    min_samples_leaf=2,
    max_features='sqrt',  
    bootstrap=True,
    class_weight="balanced",   
    random_state=42,
    n_jobs=-1
)

rf.fit(X_train_enc, y_train)
Out[13]:
RandomForestClassifier(class_weight='balanced', min_samples_leaf=2,
                       min_samples_split=5, n_estimators=800, n_jobs=-1,
                       random_state=42)

2.1.1 Model Performance ¶

In [14]:
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score, roc_auc_score, roc_curve, confusion_matrix
import matplotlib.pyplot as plt

# Predictions
y_pred = rf.predict(X_test_enc)
y_pred_proba = rf.predict_proba(X_test_enc)[:, 1]  # Probabilities for ROC/AUC

# Metrics
accuracy = accuracy_score(y_test, y_pred)
precision = precision_score(y_test, y_pred)
recall = recall_score(y_test, y_pred)
f1 = f1_score(y_test, y_pred)
roc_auc = roc_auc_score(y_test, y_pred_proba)

print("Accuracy:", round(accuracy, 4))
print("Precision:", round(precision, 4))
print("Recall:", round(recall, 4))
print("F1 Score:", round(f1, 4))
print("ROC AUC:", round(roc_auc, 4))

# Confusion matrix
cm = confusion_matrix(y_test, y_pred)
print("\nConfusion Matrix:\n", cm)

# ROC Curve
fpr, tpr, thresholds = roc_curve(y_test, y_pred_proba)
plt.figure(figsize=(6, 5))
plt.plot(fpr, tpr, label=f"AUC = {roc_auc:.3f}")
plt.plot([0, 1], [0, 1], linestyle="--", color="gray")
plt.xlabel("False Positive Rate")
plt.ylabel("True Positive Rate")
plt.title("ROC Curve")
plt.legend()
plt.show()
Accuracy: 0.8125
Precision: 0.4426
Recall: 0.0455
F1 Score: 0.0826
ROC AUC: 0.6652

Confusion Matrix:
 [[2573   34]
 [ 566   27]]
[Figure: ROC curve for the full random forest]
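The recall of 0.0455 means the model misses most defaulters at the default 0.5 probability cutoff. One common remedy (not part of the original pipeline; shown here on synthetic data as a sketch) is to lower the decision threshold, trading precision for recall:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import precision_score, recall_score
from sklearn.model_selection import train_test_split

# Synthetic imbalanced data standing in for the loan dataset (hypothetical)
X, y = make_classification(n_samples=4000, weights=[0.8, 0.2], random_state=42)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=42)

clf = RandomForestClassifier(n_estimators=200, class_weight="balanced",
                             random_state=42, n_jobs=-1).fit(X_tr, y_tr)
proba = clf.predict_proba(X_te)[:, 1]

# Sweep the decision threshold: lower cutoffs flag more loans as risky
results = {}
for thr in (0.5, 0.3, 0.2):
    pred = (proba >= thr).astype(int)
    results[thr] = (precision_score(y_te, pred), recall_score(y_te, pred))
    print(f"thr={thr:.1f}  precision={results[thr][0]:.3f}  recall={results[thr][1]:.3f}")
```

Because lowering the threshold can only add predicted positives, recall never decreases as the cutoff drops; the lender's choice of threshold then depends on the relative cost of a missed default versus a rejected good loan.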

2.1.2 Feature Importance ¶

With the random forest fitted, we next identify its top 10 most important features.

In [15]:
imp_series = pd.Series(rf.feature_importances_, index=feat_names_all).sort_values(ascending=False)
top10_feats = imp_series.head(10).index.tolist()
print("\nTop-10 features:")
print(imp_series.head(10))
Top-10 features:
CR009    0.071677
AP001    0.059917
MB005    0.054519
TD013    0.052974
TD009    0.046844
CR019    0.041405
TD005    0.041022
TD024    0.040452
CR018    0.039471
TD014    0.035555
dtype: float64
In [16]:
X_train_top = X_train_enc[top10_feats]
X_test_top = X_test_enc[top10_feats]

rf_top = RandomForestClassifier(
    n_estimators=500,
    max_depth=6,
    min_samples_split=10,
    min_samples_leaf=5,
    max_features=0.5,
    bootstrap=True,
    class_weight="balanced",
    random_state=42,
    n_jobs=-1
)
rf_top.fit(X_train_top, y_train)
Out[16]:
RandomForestClassifier(class_weight='balanced', max_depth=6, max_features=0.5,
                       min_samples_leaf=5, min_samples_split=10,
                       n_estimators=500, n_jobs=-1, random_state=42)
In [17]:
auc_top = roc_auc_score(y_test, rf_top.predict_proba(X_test_top)[:, 1])
print(f"Top-10 RF ROC-AUC: {auc_top:.4f}")
Top-10 RF ROC-AUC: 0.6205

Section 3 SHAP ¶

Next, to compute SHAP values for the model, we need to create an Explainer object and use it to evaluate a sample or the full dataset:

3.1 Fit Explainer ¶

In [18]:
import shap
explainer = shap.TreeExplainer(rf_top)           
shap_values = explainer(X_test_top)
print("values shape:", shap_values.values.shape)
values shape: (3200, 10, 2)
In [19]:
# Select SHAP values for the positive class: index 1 on the LAST (class) axis,
# not the feature axis
sv_pos = shap_values[:, :, 1]
print("sv_pos shape:", sv_pos.values.shape)
sv_pos shape: (3200, 10)

3.2 SHAP Plots ¶

3.2.1 Bar Plot for Feature Importance ¶

In [20]:
sv_pos = shap.Explanation(
    values       = shap_values.values[:, :, 1],     # (n_samples, n_features)
    base_values  = shap_values.base_values[:, 1],   # (n_samples,)
    data         = X_test_top.values,               # match the same rows/cols
    feature_names= list(X_test_top.columns)
)

shap.plots.bar(sv_pos, max_display=10, clustering=None)
[Figure: SHAP global bar plot]

This SHAP bar plot shows the ten features with the largest average impact on the model’s predictions. Variables such as TD013, CR009, and TD009 have the strongest influence, each with a mean absolute SHAP value of roughly 0.03, while others like TD005 and MB005 also play important roles. The remaining features have smaller but still noticeable impacts, suggesting that a few key variables dominate the model’s decision-making.
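Each bar’s height is the mean absolute SHAP value for that feature. Computing it by hand on a small hypothetical SHAP matrix makes the aggregation explicit (a sketch, not the project’s actual SHAP values):

```python
import numpy as np

# Hypothetical SHAP matrix: 5 samples x 3 features
sv = np.array([[ 0.02, -0.05,  0.01],
               [-0.03,  0.04,  0.00],
               [ 0.01, -0.02,  0.03],
               [ 0.04,  0.01, -0.01],
               [-0.02,  0.03,  0.02]])

# Bar height per feature: mean of |SHAP| across samples
mean_abs = np.abs(sv).mean(axis=0)
print(mean_abs)  # feature-wise global importance: [0.024, 0.03, 0.014]
```

Taking the absolute value first matters: a feature that pushes some predictions up and others down would otherwise average toward zero and look unimportant.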

In [21]:
rows = [0, 1, 2, 3]                     
fig, axes = plt.subplots(2, 2, figsize=(20,12))

for ax, r in zip(axes.ravel(), rows):    
    plt.sca(ax)                      
    shap.plots.bar(sv_pos[r], show=False, max_display=10)
    ax.set_title(f"Observation {r}")

plt.tight_layout()
plt.show()
[Figure: SHAP bar plots for Observations 0-3]

These SHAP bar plots illustrate the feature contributions for four individual observations. In each case, CR009 consistently has the strongest negative impact on the prediction, lowering the output significantly. Other features such as TD005, MB005, and TD024 occasionally provide small positive contributions, but their influence is much weaker compared to CR009, showing that this feature dominates the model’s decision for these instances.

3.2.2 Summary Plot ¶

In [22]:
shap.plots.beeswarm(sv_pos, max_display=10)
[Figure: SHAP beeswarm summary plot]

The SHAP summary plot shows how the top features influence the model’s predictions. Features like TD013, CR009, and TD009 have the strongest impact, with both positive and negative SHAP values, meaning they can either increase or decrease the prediction depending on their values. The color gradient shows that higher feature values (red) tend to push the prediction upward, while lower values (blue) generally push it downward, highlighting the directional effect of each feature.

3.2.3 Waterfall Plot ¶

3.2.3.1 Observation 1 ¶
In [23]:
# Observation 1
shap.plots.waterfall(sv_pos[0],max_display=10)
[Figure: SHAP waterfall plot, Observation 1]

The waterfall plot of the first observation shows how individual features contributed to lowering the model output to 0.432, below the base value of 0.5. The largest negative impact came from CR009 (-0.08), followed by smaller negative contributions from CR019 and CR018. In contrast, features like TD024, TD005, TD009, and MB005 had small positive effects, slightly offsetting the drop but not enough to outweigh the strong downward pull from CR009.

3.2.3.2 Observation 2 ¶
In [24]:
# Observation 2
shap.plots.waterfall(sv_pos[1],max_display=10)
[Figure: SHAP waterfall plot, Observation 2]

The waterfall plot of the second observation shows that the model prediction of 0.308 is well below the base value of 0.5. The largest negative drivers were TD013 (-0.09), TD009 (-0.08), and TD005 (-0.04), which strongly pulled the prediction downward. On the other hand, MB005 (+0.02) and smaller contributions from CR009 and CR018 (+0.01 each) slightly increased the score, but their impact was not enough to counteract the strong negative effects.

3.2.3.3 Observation 3 ¶
In [25]:
# Observation 3
shap.plots.waterfall(sv_pos[2],max_display=10)
[Figure: SHAP waterfall plot, Observation 3]

This SHAP waterfall plot shows that the model predicted 0.445, slightly below the base value of 0.5. The main negative driver was CR009 (-0.06), which significantly lowered the score. In contrast, TD024 (+0.01) and MB005 (+0.01) provided small positive contributions, but these were outweighed by additional negative effects from CR019 (-0.01) and TD005 (-0.01), keeping the final prediction below the baseline.

3.2.3.4 Observation 4 ¶
In [26]:
# Observation 4
shap.plots.waterfall(sv_pos[3],max_display=10)
[Figure: SHAP waterfall plot, Observation 4]

This SHAP waterfall plot shows that the model predicted 0.427, below the baseline of 0.5. The largest negative impact came from CR009 (-0.08), which strongly reduced the prediction. While TD005, MB005, and TD024 provided small positive contributions (+0.01 each), they were outweighed by additional negative effects from CR019 (-0.01) and TD024 (-0.01), leading to an overall lower prediction.

3.2.4 Force Plot ¶

3.2.4.1 Observation 1 ¶
In [27]:
row = 0  
shap.initjs()
shap.force_plot(
    base_value = sv_pos.base_values[row],
    shap_values = sv_pos.values[row, :],
    features = X_test_top.iloc[row, :],
    feature_names = sv_pos.feature_names
)
Out[27]:
[Figure: interactive force plot for Observation 1 (JavaScript output not rendered in this export)]

This SHAP force plot shows that the model’s prediction is 0.43, lower than the baseline of ~0.50. Features MB005, TD009, TD005, and TD024 pushed the prediction upward, but their influence was outweighed by strong downward contributions from CR009, CR019, and CR018, with CR009 having the largest negative effect. Overall, the negative impacts dominated, lowering the final prediction below the baseline.

3.2.4.2 Observation 2 ¶
In [28]:
row = 1 
shap.initjs()
shap.force_plot(
    base_value = sv_pos.base_values[row],
    shap_values = sv_pos.values[row, :],
    features = X_test_top.iloc[row, :],
    feature_names = sv_pos.feature_names
)
Out[28]:
[Figure: interactive force plot for Observation 2 (JavaScript output not rendered in this export)]

This SHAP force plot shows the model’s prediction of 0.31, which is significantly lower than the baseline of ~0.50. While CR009 and MB005 provided small upward pushes, their effect was minimal compared to the strong downward contributions from TD013, TD009, TD005, TD014, and TD024. These negative influences collectively drove the prediction well below the baseline.

3.2.4.3 Observation 3 ¶
In [29]:
row = 2  
shap.initjs()
shap.force_plot(
    base_value = sv_pos.base_values[row],
    shap_values = sv_pos.values[row, :],
    features = X_test_top.iloc[row, :],
    feature_names = sv_pos.feature_names
)
Out[29]:
[Figure: interactive force plot for Observation 3 (JavaScript output not rendered in this export)]

This SHAP force plot shows a prediction of 0.45, slightly below the baseline of ~0.50. Features MB005 and TD024 increased the prediction (pushing it higher), but their influence was outweighed by strong downward contributions from CR009, CR019, TD005, and AP001, which collectively pulled the score lower. This balance of effects explains why the final prediction remained below the baseline.

3.2.4.4 Observation 4 ¶
In [30]:
row = 3  
shap.initjs()
shap.force_plot(
    base_value = sv_pos.base_values[row],
    shap_values = sv_pos.values[row, :],
    features = X_test_top.iloc[row, :],
    feature_names = sv_pos.feature_names
)
Out[30]:
[Figure: interactive force plot for Observation 4 (JavaScript output not rendered in this export)]

This SHAP force plot shows a prediction of 0.43, which is below the baseline of ~0.50. Features like TD013, MB005, and TD005 pushed the prediction higher, but their influence was outweighed by strong downward contributions from CR009 and CR019, which significantly reduced the score. As a result, the final prediction leaned toward a lower risk outcome.

3.2.5 Dependence Plot ¶

In [31]:
top_idx = np.argsort(np.abs(sv_pos.values).mean(0))[::-1][:10]
top_feats = [sv_pos.feature_names[i] for i in top_idx]

shap.plots.scatter(sv_pos[:, top_feats[0]])                      
shap.plots.scatter(sv_pos[:, top_feats[0]], color=sv_pos[:, top_feats[1]])
[Figures: SHAP dependence plots for TD013, the second colored by CR009]

These dependence plots show how TD013 influences the model prediction.

  • In the first graph, as TD013 increases, its SHAP value rises sharply from negative to strongly positive, peaking around 15–20 before flattening, meaning higher TD013 values increase the likelihood of a positive prediction.
  • In the second graph, we see the same trend but colored by CR009. Darker red points (higher CR009 values) are more concentrated where SHAP values are positive, suggesting that high TD013 combined with high CR009 amplifies the positive impact on the model’s output.
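The rising-then-flattening trend in the first plot can be checked numerically by correlating a feature’s values with its SHAP values. A sketch with hypothetical data shaped like the TD013 pattern (a saturating curve plus noise, not the project data):

```python
import numpy as np

rng = np.random.default_rng(42)
# Hypothetical feature/SHAP pairs: SHAP values rise with the feature,
# then flatten out, mimicking the TD013 dependence plot
x = rng.uniform(0, 30, 500)
shap_x = np.tanh((x - 10) / 5) * 0.05 + rng.normal(0, 0.005, 500)

corr = np.corrcoef(x, shap_x)[0, 1]
print(f"Pearson correlation: {corr:.2f}")  # strongly positive
```

A strong positive correlation confirms the monotone trend, though it understates the relationship where the curve saturates; a rank correlation would capture that part better.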

3.2.6 Heatmap ¶

In [32]:
shap.plots.heatmap(sv_pos[:, top_feats], instance_order=shap.Explanation.abs.mean(1))
[Figure: SHAP heatmap of top features across instances]
Out[32]:
<Axes: xlabel='Instances'>

This summary plot shows the variation of SHAP values across all features and instances, indicating how each feature impacts the model’s predictions. TD013, CR009, and TD009 stand out with strong positive (red) and negative (blue) contributions, suggesting they are the most influential drivers of the model output. The alternating bands of red and blue highlight how the same feature can either push the prediction higher or lower depending on its value. In contrast, features like AP001 and CR019 show lighter and more neutral effects, meaning they have a relatively smaller influence on predictions.

3.2.7 Decision Plot Comparison ¶

In [33]:
rows = [0, 1, 2, 3]  # choose any 4 rows you like

fig, axes = plt.subplots(2, 2, figsize=(14, 8))
axes = axes.ravel()

for ax, r in zip(axes, rows):
    plt.sca(ax)
    shap.decision_plot(
        base_value     = sv_pos.base_values[r],
        shap_values    = sv_pos.values[r],
        features       = X_test_top.iloc[r, :],
        feature_names  = sv_pos.feature_names,
        show=False
    )
    ax.set_title(f"Observation {r}")

plt.tight_layout()
plt.show()
[Figure: SHAP decision plots for Observations 0-3]

These decision plots illustrate how different features contribute to the model predictions for four individual observations. For Observations 0, 2, and 3, CR009 has the strongest negative impact, pulling the prediction downward, while smaller contributions from features like TD024, MB005, and CR019 slightly offset this effect. Observation 1, on the other hand, is mainly influenced by TD013 and TD009, both driving the prediction lower. Overall, the plots highlight that CR009 consistently dominates the prediction direction, while other features contribute in smaller, observation-specific ways.

3.3 Feature Importance Under SHAP ¶

In [34]:
from scipy.special import softmax

def print_feature_importances_shap_values(shap_values, features):
    '''
    Prints the feature importances based on SHAP values in an ordered way
    shap_values -> The SHAP values calculated from a shap.Explainer object
    features -> The names of the features, in the order presented to the explainer
    '''

    importances = []
    for i in range(shap_values.values.shape[1]):
        importances.append(np.mean(np.abs(shap_values.values[:, i])))

    importances_norm = softmax(importances)

    feature_importances = {fea: imp for imp, fea in zip(importances, features)}
    feature_importances_norm = {fea: imp for imp, fea in zip(importances_norm, features)}

    feature_importances = {k: v for k, v in sorted(feature_importances.items(), key = lambda item: item[1], reverse = True)}
    feature_importances_norm = {k: v for k, v in sorted(feature_importances_norm.items(), key = lambda item: item[1], reverse = True)}

    for k, v in feature_importances.items():
        print(f"{k} -> {v:.4f} (softmax = {feature_importances_norm[k]:,.4f})")  
In [35]:
print_feature_importances_shap_values(shap_values, top10_feats) 
TD013 -> 0.0319 (softmax = 0.1017)
CR009 -> 0.0269 (softmax = 0.1012)
TD009 -> 0.0253 (softmax = 0.1010)
TD005 -> 0.0176 (softmax = 0.1002)
MB005 -> 0.0160 (softmax = 0.1001)
TD014 -> 0.0090 (softmax = 0.0994)
TD024 -> 0.0069 (softmax = 0.0992)
CR018 -> 0.0068 (softmax = 0.0992)
CR019 -> 0.0058 (softmax = 0.0991)
AP001 -> 0.0050 (softmax = 0.0990)

The feature importance results show that TD013, CR009, and TD009 are the top three contributors, each with similar SHAP-based importance scores (~0.031–0.025) and softmax weights around 0.10. These features dominate the model’s decision-making, while the remaining features have much smaller influences, indicating they play only marginal roles in shaping predictions. This suggests the model’s behavior is strongly driven by a small set of key variables.
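The near-uniform softmax weights (all close to 0.10) are expected rather than informative: softmax exponentiates its inputs, and for values this small exp(x) ≈ 1 + x, which compresses the differences. A minimal sketch reproducing the effect, with the raw importances copied from the output above:

```python
import numpy as np
from scipy.special import softmax

# Raw mean-|SHAP| importances copied from the output above
imps = np.array([0.0319, 0.0269, 0.0253, 0.0176, 0.0160,
                 0.0090, 0.0069, 0.0068, 0.0058, 0.0050])

print(softmax(imps))        # nearly uniform: for small x, exp(x) ~ 1 + x
print(softmax(imps * 100))  # rescaling first makes the differences visible
```

For ranking purposes the raw mean-|SHAP| values on the left of the printout are therefore the numbers to compare.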

Section 4 Summary & Conclusions ¶

Summary

Our SHAP analysis highlights that the model’s predictions are heavily influenced by a small set of features, with TD013, CR009, and TD009 emerging as the most impactful drivers. Dependence and force plots show that increases in TD013 tend to push predictions upward, while CR009 generally exerts a strong negative pull. Interaction effects between these variables amplify their importance, as seen when high values of TD013 align with high CR009. Other features, such as TD005 and MB005, contribute moderately, while the remaining predictors play only marginal roles.

Conclusions

The findings suggest that lender risk assessment should focus closely on a handful of dominant indicators, particularly TD013, CR009, and TD009, as they shape most of the model’s predictive power. This concentration of influence implies that improving data quality and monitoring around these key variables could substantially enhance fraud detection and default prediction. At the same time, the relatively small impact of the other features indicates diminishing returns in expanding feature sets without addressing these core drivers. Overall, SHAP analysis not only confirms which variables matter most but also provides transparency into how they push individual predictions higher or lower—helping lenders adopt more targeted and interpretable risk management strategies.